Improving the Performance of OLAP Queries Using Families of Statistics Trees
Abstract
We present a novel approach to speeding up the evaluation of OLAP queries that return aggregates over dimensions containing hierarchies. Our approach builds on our previous version of CubiST (Cubing with Statistics Trees), which pre-computes and stores all possible aggregate views in the leaves of a statistics tree during a one-time scan of the data. However, that version uses a single statistics tree to answer all possible OLAP queries. Our new version remedies this limitation by materializing a family of derived trees from the single statistics tree. Given an input query, our new query evaluation algorithm selects the smallest tree in the family that can provide the answer. Our experiments show drastic reductions in processing times compared with the original CubiST as well as existing ROLAP and MOLAP systems.
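The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how a tree is judged able to answer a query, so the sketch assumes each derived tree is described by one hierarchy level per dimension (0 = finest) and that a tree qualifies when it is at least as fine-grained as the query in every dimension. The names `DerivedTree`, `can_answer`, and `select_tree`, and the `size` measure, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class DerivedTree:
    """Hypothetical descriptor of one tree in the family (not from the paper)."""
    name: str
    levels: Tuple[int, ...]  # hierarchy level per dimension; 0 = finest granularity
    size: int                # any size measure, e.g. number of nodes


def can_answer(tree: DerivedTree, query_levels: Sequence[int]) -> bool:
    # Assumption: a tree can answer a query if, in every dimension, it keeps
    # data at a level at least as fine as the level the query asks for.
    if len(tree.levels) != len(query_levels):
        return False
    return all(t <= q for t, q in zip(tree.levels, query_levels))


def select_tree(family: Sequence[DerivedTree],
                query_levels: Sequence[int]) -> Optional[DerivedTree]:
    # Keep only the trees that qualify, then pick the smallest one.
    candidates = [t for t in family if can_answer(t, query_levels)]
    return min(candidates, key=lambda t: t.size, default=None)


# Illustrative family over two dimensions; sizes are made-up numbers.
family = [
    DerivedTree("base", (0, 0), 10_000),
    DerivedTree("d1_rolled", (1, 0), 2_000),
    DerivedTree("d1_d2_rolled", (1, 2), 300),
    DerivedTree("coarsest", (2, 2), 50),
]

chosen = select_tree(family, (1, 2))
print(chosen.name if chosen else "no tree can answer this query")
```

Under these assumptions, a query at levels (1, 2) could also be answered from the base tree, but the selection prefers the smaller derived tree, which is where the abstract locates the speed-up.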
Related resources
CUBIST: A NEW APPROACH TO IMPROVING THE PERFORMANCE OF AD-HOC CUBE QUERIES By LIXIN FU A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
... suggestions. Last but not least, I extend my utmost gratitude to my wife Jia Liu, my mother-in-law Sun Xinhua, and my son Andrew Fu for their enduring support. We provide a new approach to speeding up the evaluation of cube queries, an important class of OLAP queries which return aggregated values rather than sets of tuples. Our new algorithm called CubiST (Cubing with Statistics Trees) represe...
CUBIST: A New Approach to Speeding Up OLAP Queries
We report on a new, efficient encoding for the data cube, which results in a drastic speed-up of OLAP queries that aggregate along any combination of dimensions over numerical and categorical attributes. Specifically, we introduce a new data structure, called Statistics Tree (ST), together with an algorithm, called CubiST (Cubing with Statistics Trees), for evaluating OLAP queries on top of a r...
Construction of Decision Trees Using Data Cube
Data classification is an important problem in data mining. The traditional classification algorithms based on decision trees have been widely used due to their fast model construction and good model understandability. However, the existing decision tree algorithms need to recursively partition the dataset into subsets according to some splitting criteria, i.e., they still have to repeatedly compute ...
Physical Data Warehouse Design on NoSQL Databases - OLAP Query Processing over HBase
Nowadays, data warehousing and online analytical processing (OLAP) are core technologies in business intelligence and therefore have drawn much interest from researchers in the last decade. However, these technologies have been mainly developed for relational database systems in centralized environments. In other words, these technologies have not been designed to be applied in scalable systems s...
Improving Query Performance on OLAP-Data Using Enhanced Multidimensional Indices
Multidimensional indices are an efficient way to improve query performance on OLAP data. One popular and successful multidimensional index structure is the R*-tree, a member of the well-known R-tree family. We enhance the R*-tree to improve the performance of range queries on OLAP data. First, the following observations are presented. (1) The clustering pattern of the tuples (of the OLAP data...
Publication date: 2001